Peng Mi (mipeng@vt.edu)
(Primary Contact)
Yong Cao (yongcao@vt.edu)
Virginia Polytechnic Institute and State University
Student Team: No
Animated Visualization Toolkit (AVIST) is a GPU-accelerated visualization
tool, developed at Virginia Tech, that features real-time animated
visualization of streaming datasets and multiple coordinated views. AVIST uses
the parallel computing capacity of GPUs to visualize and analyze large
datasets. By generating geometry and rendering data with parallel algorithms,
AVIST can provide real-time visual analytics of millions of data records.
AVIST provides four coordinated views: histogram view, parallel coordinate
view, dynamic view, and graph view. It also supports three kinds of complex
disjunctive data filters: highlight filters, exclusive filters, and negative
exclusive filters. The combination of the four coordinated views and the three
disjunctive-normal-form (DNF) filters helps users identify patterns and
unexpected dynamic events in large datasets. Finally, AVIST supports
time-synced visualization of multiple datasets, such as the IPS, network flow,
and computer status datasets in this challenge.
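The DNF filters described above can be illustrated with a minimal Python sketch. This is not AVIST's GPU implementation. The semantics of the three filter kinds are an assumption made for illustration: a highlight filter flags matching records, an exclusive filter keeps only matching records, and a negative exclusive filter removes matching records. The helper names (field_eq, matches, and so on) are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Record = Dict[str, object]
Predicate = Callable[[Record], bool]
# A DNF expression is an OR over clauses; each clause is an AND over predicates.
DNF = List[List[Predicate]]


def field_eq(name: str, value: object) -> Predicate:
    """Build a predicate that tests one field for equality."""
    return lambda rec: rec.get(name) == value


def matches(dnf: DNF, rec: Record) -> bool:
    """True if at least one clause has all of its predicates satisfied."""
    return any(all(pred(rec) for pred in clause) for clause in dnf)


def exclusive(records: List[Record], dnf: DNF) -> List[Record]:
    """Assumed semantics: keep only the records that match the filter."""
    return [rec for rec in records if matches(dnf, rec)]


def negative_exclusive(records: List[Record], dnf: DNF) -> List[Record]:
    """Assumed semantics: drop the records that match the filter."""
    return [rec for rec in records if not matches(dnf, rec)]


def highlight(records: List[Record], dnf: DNF) -> List[Tuple[Record, bool]]:
    """Assumed semantics: keep every record and flag the matching ones."""
    return [(rec, matches(dnf, rec)) for rec in records]
```

For example, the expression [[field_eq("firstSeenSrcIp", "172.10.0.6"), field_eq("firstSeenSrcPort", 1984)], [field_eq("firstSeenSrcPort", 80)]] reads as "(source IP is 172.10.0.6 AND source port is 1984) OR (source port is 80)".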
May we post your submission in the Visual Analytics Benchmark
Repository after VAST Challenge 2013 is complete? Yes
Video:
http://people.cs.vt.edu/mipeng/vast_2013/vt-avist-mc3.wmv
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Questions
MC3.1
Provide a timeline (i.e., events organized in chronological order) of the
notable events that occur in Big Marketing's computer networks for the two weeks
of supplied data. Use all data at your disposal to identify up to twelve events
and describe them to the extent possible.
Your answer should be no more than 1000 words long and may contain up to
twelve images.
Week One Dataset
Event 1:
Situation: In the network flow dataset, we identified records whose
ipLayerProtocolCode is OTHER. They indicate that firstSeenSrcIp 172.10.0.6
used firstSeenSrcPort 0 to scan many destination IPs on firstSeenDestPort 0.
Time: Throughout Week One, from 07:50, 4/1/2013 to 5:51, 4/7/2013.
Event 2
Situation: The computer with firstSeenSrcIp 172.10.0.6 used its port 1984 (the
Big Brother Network Monitor port) to scan the Big Marketing network. Meanwhile,
this computer also used its port 0 to scan the network. In the following
picture, green represents port 1984 and red represents port 0.
Time: Throughout Week One.
Event 3
The computer with an unknown firstSeenDestIp, 10.3.1.25, periodically scanned
the email servers of the Big Marketing network. Interestingly, it sometimes
scanned all three email servers together and sometimes scanned only one or two
of them.
Event 4
Situation: The Network Health and Status data shows an interesting picture.
Records with different service names are not interleaved; instead, they follow
a pattern. First, warnings (green) make up most of the records, then there is a
period with no warnings, and then a large number of errors (blue) arrive. Other
colors represent other service names. The warnings lasted from 09:05, 4/1/2013
to 06:50, 4/2/2013 and were associated with the service name disk; the errors
lasted from 00:42, 4/3/2013 to 18:47, 4/3/2013 and their service name was
conn.
Event 5
Situation: The web server of Branch Three (the computer with firstSeenSrcIp
172.30.0.4) scanned a set of unknown firstSeenDestIp addresses ranging from
10.6.XX to 10.206.XX using port 80. The scanning took place in three separate
time periods, each lasting two and a half hours.
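One way to recover scan periods like these from raw flow timestamps is to split the sorted timestamps wherever a long gap occurs. The sketch below is a plain-Python illustration, not the method AVIST uses (the periods above were found visually). The function name split_into_bursts and the max_gap threshold are assumptions.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def split_into_bursts(
    times: List[datetime], max_gap: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Group timestamps into (start, end) bursts.

    A new burst starts whenever the gap to the previous timestamp
    exceeds max_gap.
    """
    bursts: List[Tuple[datetime, datetime]] = []
    for t in sorted(times):
        if bursts and t - bursts[-1][1] <= max_gap:
            # Close enough to the current burst: extend its end time.
            bursts[-1] = (bursts[-1][0], t)
        else:
            # Gap too large (or first timestamp): open a new burst.
            bursts.append((t, t))
    return bursts


def burst_durations(bursts: List[Tuple[datetime, datetime]]) -> List[timedelta]:
    """Length of each burst, useful for checking that periods are equal."""
    return [end - start for start, end in bursts]
```

Applying split_into_bursts to the timestamps of the port 80 records from 172.30.0.4 and comparing the values returned by burst_durations would show whether the periods really have the same length.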
Event 6
Situation: After the network broke down, the computer with firstSeenDestIp
239.255.255.250 sent a large number of UDP packets from its port 1900
(Microsoft SSDP, which enables discovery of UPnP devices) to enable the whole
network between 3:26, 4/3/2013 and 6:57, 4/3/2013. In the following days, this
computer sent the same traffic three times after the network had broken down.
Event 7:
Situation: The webmail servers 172.30.0.3 and 172.20.0.3 simultaneously
scanned the network from port 80. (The graph view is generated from
firstSeenSrcIp and firstSeenSrcPort using a force-directed algorithm. Each
source IP has its own unique source port, so we assume that the webmail servers
scanned the network; otherwise, the source IPs would have been connected to one
another through their ports.)
Time: From 3:30, 4/3/2013 to 6:50, 4/3/2013.
Event 8
Situation: The computer with firstSeenSrcIp 10.9.81.5 is very suspicious.
Between 09:30, 4/3/2013 and 11:25, 4/3/2013 it was scanned by the web server of
Enterprise Site 3 (the computer with firstSeenDestIp 172.30.0.4). Then, between
9:27, 4/6/2013 and 3:19, 4/7/2013, it scanned the computers with
firstSeenDestIp 172.10.0.4, 172.10.0.5, 172.10.0.9, and 172.20.0.6.
Event 9
The computer with firstSeenSrcIp 172.20.0.15 scanned the computers with
firstSeenDestIp 10.6.6.7, 10.12.15.152, 10.15.7.05, 10.70.68.127, and
10.250.178.101 from its port 80 between 9:30, 4/3/2013 and 07:06, 7/6/2013.
Week Two
Event 10
In the Network Health and Status data of Week Two, the DC, Email, Web, and DNS
servers reported problems the whole time, and their numProcs,
localAveragePercent, and physicalMemoryUsage fields are all empty.
Event 11
Situation: The DC, Email, Web, and DNS servers were periodically accessed on
port 3389 from the computers whose IPs are 10.6.6.7, 10.12.15.152, and
10.13.XX.XX. These computers were trying to control the servers remotely. Red
marks the inbound accesses; for the green records, the direction field is
empty.
Event 12
Warnings, which also appear periodically, make up most of the IPS dataset of
Week Two. These warnings are records for the TCP protocol. The following table
describes this event.
Color | Source IP | Source Port | Destination IP |
Red | 10.12.15.152 | 37551, 37552 | 10.0.2.2~10.0.2.8 |
Green | 10.6.6.7 | 46396, 46397 | 10.0.4.2 |
Blue | 10.17.15.10 | 40598, 40599 | 10.0.2.2 |
Purple | 10.12.14.15 | 51447, 51448 | 10.0.3.2~10.0.3.5 |
Cyan | 10.13.77.49 | 61699, 61700 | 10.0.2.2 |
MC3.2
Speculate on one or more narratives that describe the events on the network.
Provide a list of analytic hypotheses and/or unanswered questions about the
notable events. In other words, if you were to hand off your timeline to an
analyst who will conduct further investigation, what confirmations and/or
answers would you like to see in their report back to you? Your answer should
be no more than 300 words long and may contain up to three additional images.
For the Week One dataset, our first hypothesis is that the webmail servers
were infected at the beginning of the week or had already been infected. The
following image shows that the webmail servers were very active before the
network went down.
The second hypothesis concerns the router. Unusual events happened between
05:35, 4/3/2013 and 05:36, 4/3/2013: this single minute contains 4435845
records. We therefore used the graph view to examine the connections between
the source IPs and destination IPs. In this image, 172.10.0.6 is a hub
connected to many other nodes, and 172.0.0.1 is a second hub.
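The hub observation above can also be checked numerically by counting the distinct peers of each IP. The following Python sketch assumes the flow records have been reduced to (source IP, destination IP) pairs; the function name find_hubs and the min_degree threshold are hypothetical and not part of AVIST.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def find_hubs(
    edges: Iterable[Tuple[str, str]], min_degree: int
) -> List[Tuple[str, int]]:
    """Return (ip, degree) for every IP with at least min_degree distinct peers.

    Edges are treated as undirected, and repeated flows between the same
    pair of IPs count once.
    """
    neighbors: Dict[str, Set[str]] = defaultdict(set)
    for src, dst in edges:
        if src == dst:
            continue  # ignore self-loops
        neighbors[src].add(dst)
        neighbors[dst].add(src)
    hubs = [
        (ip, len(peers))
        for ip, peers in neighbors.items()
        if len(peers) >= min_degree
    ]
    # Highest degree first; ties are broken by IP string for stable output.
    return sorted(hubs, key=lambda item: (-item[1], item[0]))
```

Running find_hubs on the records from the minute in question, with a threshold chosen by the analyst, would list the candidate hubs together with their peer counts.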
The third hypothesis concerns the relationship between Week One and Week Two.
In Week Two, the network flow dataset is very small compared with Week One. Our
hypothesis is that the network problems of Week One persisted into Week Two and
became even worse. The following image shows the network flow dataset and the
IPS dataset synchronized in time. The two datasets show similar patterns, but
the IPS data contains more information than the network flow data. We therefore
infer that the logging servers of the network were damaged.
MC3.3
Describe the role that your visual analytics played in enabling discovery of
the notable events in MC3.1. Describe whether your visual analytics play a role
in formulating the questions in MC3.2. Your answer should be no more than 300
words long and may contain up to three additional images.
AVIST proved to be an effective and efficient tool for this analysis. After
loading the data, users can apply filters to remove common connections, such as
accesses to ports 80 and 21, and then play the animation. Users normally rely
on the histogram view and the control panel to find the IPs or ports to filter.
The following image shows how to remove port 80 from the visualized data views.
Users can spot common scanning behavior in the parallel coordinate view while
the animation plays. The following image shows the animation after we remove
UDP and other protocols and port 80. Users can then apply the highlight filter
to see how the records are distributed across the whole dataset by combining
the parallel coordinate view and the dynamic view. In the following image we
can easily focus on source IP 172.10.0.6, and we can also try different colors
for comparison and pattern finding.
Users can form hypotheses or verify their previous findings using AVIST. In
Event 2, we found that 172.10.0.6 uses port 1984 (the Big Brother Network
Monitor port) and port 0. Is it true that 172.10.0.6 only has connections on
these two ports, or does it use other ports as well? We used AVIST to select
the network records that involve 172.10.0.6 but not ports 1984 or 0. AVIST
shows that 172.10.0.6 also has connections on other ports.
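The verification step above amounts to a simple selection over the flow records. A minimal Python sketch follows, assuming each record is a dictionary keyed by the network flow field names used in this report. It treats a record as involving the IP when the IP is either the source or the destination, which is an interpretation rather than something stated above. The function name other_ports is hypothetical.

```python
from typing import Dict, Iterable, List, Set


def other_ports(
    records: Iterable[Dict[str, object]], ip: str, excluded_ports: Set[int]
) -> List[int]:
    """List the ports seen on records that involve ip but no excluded port."""
    ports: Set[int] = set()
    for rec in records:
        # Keep only records where ip is the source or the destination.
        if ip not in (rec["firstSeenSrcIp"], rec["firstSeenDestIp"]):
            continue
        pair = {int(rec["firstSeenSrcPort"]), int(rec["firstSeenDestPort"])}
        # Skip records that use any of the already explained ports.
        if pair & excluded_ports:
            continue
        ports |= pair
    return sorted(ports)
```

Calling other_ports(records, "172.10.0.6", {0, 1984}) returns an empty list only if every connection of that IP uses port 0 or port 1984; a non-empty result corresponds to the finding reported above.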